A defining feature of many large empirical networks is their intrinsic complexity. However, many 
networks also contain a large degree of structural repetition. An immediate question then arises: 
can we characterize essential network complexity while excluding structural redundancy? 

In this article we utilize inherent network symmetry to collapse all redundant information from a 
network, resulting in a coarse-graining which we show to carry the essential structural information of 
the 'parent' network. In the context of algebraic combinatorics, this coarse-graining is known as the 
quotient. We systematically explore the theoretical properties of network quotients and summarize 
key statistics of a variety of 'real-world' quotients with respect to those of their parent networks. 
In particular, we find that quotients can be substantially smaller than their parent networks yet 
typically preserve various key functional properties such as complexity (heterogeneity and hub 
vertices) and communication (diameter and mean geodesic distance), suggesting that quotients 
constitute the essential structural skeleton of their parent network. We summarize with a discussion 
of potential uses of quotients in analysis of biological regulatory networks and ways in which using 
quotients can reduce the computational complexity of network algorithms. 

PACS numbers: 89.75.-k 89.75.Fb 05.40.-a 02.20.-a 



INTRODUCTION 

Many physical systems, from the world-wide web to scientific collaborations and biochemical reactions inside cells, can be modeled as networks. The ubiquity of empirical networks has generated increasing interest in their study over the last decade, during which much progress has been made toward elucidating general network organizational principles beyond the specific details of individual systems. Structural properties which are commonly found in many disparate networks include: the 'small-world' property; the scale-free distribution of vertex degrees; hierarchical modularity; network construction from motifs; assortative mixing; and self-similarity, amongst others. Together, investigation of generic structural properties such as these may be thought of as an attempt to understand network complexity.

In order to find simplicity in this complexity some authors have attempted to extract network 'skeletons': related networks which capture essential structural features of the system from which they are derived, but are simpler in some quantitative way. Existing network skeletons include, for instance, the fractal skeleton, which is responsible for fractal scaling, and the communication skeleton [13], which is responsible for the majority of communication flow through the network. Such skeletons are generally formed with respect to a given property, for example fractality or communication, and thus do not represent a structural skeleton in the strongest sense. In this article we propose an alternative skeleton which formally captures all essential structural information, and which can be significantly smaller than the original network from which it is derived. The method we use is based upon utilizing inherent network symmetry.

Although almost all large random networks are asymmetric [3], many empirical real-world networks are surprisingly richly symmetric [15, 3, 17, 18]. This symmetry commonly results from the presence of locally tree-like or biclique-like structures, which are present in many empirical networks and derive naturally from elementary growth processes such as growth with preferential attachment and growth with similar linkage pattern. However, despite a rich abstract theory of graph symmetry [3, 20], the symmetry structure of complex real-world networks has not yet been explored extensively.

Intuitively, a network is symmetric if two or more of its vertices can be permuted without altering vertex adjacency. Symmetric networks therefore necessarily contain a certain degree of structural redundancy, in that they possess multiple vertices which play the same structural role. Thus network symmetry is strongly related to network redundancy. In this paper we use this relationship to show how symmetry also provides a natural means to formally exclude redundancy while still preserving essential network structure, by factoring out structurally identical elements.

The structure of the remainder of this paper is as fol- 
lows: first we introduce essential background material 
concerning network automorphism groups, and show how 
a network's automorphism group may be used to pro- 
duce a coarse-grained skeleton of the network called the 
quotient. We also introduce a variation on the classical 
quotient which we call the s-quotient. Then we show that 
quotients can be substantially smaller than the network 
from which they are derived and explore ways in which 
key structural properties are inherited by the quotient 
and the s-quotient from the 'parent' network. In par- 
ticular, we shall examine how network heterogeneity, de- 
gree distribution and communication properties are car- 
ried from the parent to its quotients. 



BACKGROUND AND DEFINITIONS

Preliminaries

Formally, a network is a graph $G = (V, E)$ with vertex set $V$ and edge set $E$ in which two vertices are adjacent if there is an edge between them. An automorphism is a permutation of the vertices of the network which preserves adjacency, and the set of automorphisms under composition forms a group $\mathrm{Aut}(G)$. The automorphism group of a network compactly describes its symmetry structure. Automorphism groups can be efficiently calculated with the use of an appropriate graph isomorphism algorithm such as the nauty algorithm [23], which we use in this study. We say that a network which possesses a nontrivial automorphism group is symmetric. Previous studies have highlighted the fact that many real-world networks possess nontrivial (and often quite large) automorphism groups [13].

The vertices of a symmetric network can be partitioned into disjoint equivalence classes called orbits: for every vertex $v \in V(G)$, $v$ belongs to the orbit

$$A(v) = \{ g \cdot v \in V : g \in \mathrm{Aut}(G) \}.$$

We refer to the partitioning of the network vertices into disjoint orbits as the automorphism partition [17]. Note that since they can be permuted without altering network structure, two vertices in the same orbit are equivalent in the strongest possible structural sense: they play precisely the same structural role in the network and cannot be distinguished from one another by any meaningful structural measure (more formally, a vertex property which is preserved under isomorphism is known as a vertex invariant; vertices in the same orbit are indistinguishable by vertex invariants [24]). Vertices in the same orbit therefore possess many of the same structural properties, including the same degree, eigenvector centrality and clustering coefficient [15] (for more examples, see [2J]). We therefore say that vertices in the same group orbit are structurally equivalent. Since many real-world empirical networks possess a non-trivial automorphism partition, they carry a significant amount of redundant information in which more than one vertex plays the same structural role. In addition to elucidating the precise nature of structural repetitions in a network, the automorphism partition also provides a convenient way to factor out these structural repetitions by 'gluing together' structurally equivalent vertices to create a coarse-graining of the network, known as the quotient.

Quotients



More formally, let $A = \{A_1, A_2, \ldots, A_s\}$ be the automorphism partition of a network $G$. A significant property of this partition is that it is equitable [22]: the number of neighbors in $A_j$ of a vertex $v \in A_i$ is a constant $q_{ij}$ ($i, j = 1, 2, \ldots, s$), which depends upon $i$ and $j$ but is independent of the choice of $v \in A_i$. The quotient $Q$ of $G$ under the action of $\mathrm{Aut}(G)$ is the multi-digraph with vertex set $A$ and adjacency matrix $(q_{ij})$. We refer to the network $G$ as the parent of $Q$, and note that network quotients may be easily calculated using the nauty algorithm [13].
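
The construction above can be sketched on a toy network. The following is a minimal brute-force illustration and not the nauty-based procedure used in this study: it enumerates every vertex permutation, which is feasible only for a handful of vertices. The function names and the six-vertex example graph are hypothetical.

```python
from itertools import permutations

def automorphisms(n, edges):
    """Return every permutation of range(n) that preserves adjacency."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for perm in permutations(range(n)):
        # perm[v] is the image of v; a bijection that maps every edge onto
        # an edge of a finite simple graph is an automorphism
        if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges):
            result.append(perm)
    return result

def orbits(n, autos):
    """Partition the vertices into orbits under the given automorphisms."""
    seen, parts = set(), []
    for v in range(n):
        if v not in seen:
            orbit = sorted({g[v] for g in autos})
            seen.update(orbit)
            parts.append(orbit)
    return parts

def quotient_matrix(edges, parts):
    """q[i][j] = number of neighbours in orbit j of any vertex in orbit i."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    q = []
    for part_i in parts:
        row = []
        for part_j in parts:
            counts = {len(nbrs.get(v, set()) & set(part_j)) for v in part_i}
            # equitability: the count must not depend on the choice of v
            assert len(counts) == 1, "partition is not equitable"
            row.append(counts.pop())
        q.append(row)
    return q

# Hypothetical parent network: hub 0 with leaves 1, 2, 3 and a tail 0-4-5
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
autos = automorphisms(6, edges)
parts = orbits(6, autos)
q = quotient_matrix(edges, parts)
```

For this toy graph the orbits are {0}, {1, 2, 3}, {4} and {5}. The resulting matrix is not symmetric (the hub has three neighbors in the leaf orbit, while each leaf has one neighbor in the hub orbit), which illustrates why the quotient is in general a multi-digraph.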

The quotient contains all the structural information 
of its parent network but, by associating structurally 
equivalent vertices, formally excludes all structural rep- 
etitions. Crucially, this means that many characteristic 
properties of the parent network are preserved in the quo- 
tient (any differences are due to the fact that the quotient 
only carries the unique structural features of its parent 
without repetitions). Consequently, while they are often 
very similar, it is the properties of the quotient, and not 
those of the parent network per se, that describe core 
system complexity. For this reason, the quotient may be 
thought of as the structural skeleton of its parent. 

In the context of algebraic graph theory [21, 13], certain properties of quotients are well-known including, for example, the fact that the eigenvalues of the quotient are a subset of those of its parent [22]. However, previous studies of graph quotients have been largely mathematical in nature, and have tended to focus on properties of quotients of completely regular graphs. An investigation of the properties of quotients of real-world networks, which typically contain both regular and random elements, has not as yet been undertaken.

S-Quotients

As we noted above, quotients are generally multi- 
digraphs (that is, their edges are weighted and directed). 
This is the case even when the parent network is simple 
(that is, the edges are not weighted or directed). 

When a given network is a multi-digraph it is often convenient to consider properties of the simple underlying network, in which edge weights and directions are removed. Such underlying networks carry the adjacency information of the full network, and so retain many key network properties. Therefore, as well as examining properties of the quotient, we shall also focus on properties of the simple underlying quotient (or s-quotient for short), denoted $Q_s$, in which edge directions, edge weights and loops are removed from the quotient. Fig. 1 shows a network, its quotient and its s-quotient. The s-quotient has the advantage that it retains the adjacency information of the quotient, yet has a binary symmetric adjacency matrix and thus is more computationally efficient to work with.
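
As a minimal sketch of this simplification step, the function below turns a quotient adjacency matrix into the edge set of the s-quotient. The function name and the example matrix are hypothetical; the matrix is the quotient of a triangle with one pendant vertex attached, whose orbits are the attachment vertex, the other two triangle vertices, and the pendant vertex.

```python
def s_quotient_edges(q):
    """Undirected, unweighted, loop-free edge set of the s-quotient."""
    s = len(q)
    # pairs with i < j only, so loops (q[i][i] > 0) are dropped automatically;
    # any positive weight in either direction becomes a single edge
    return {(i, j) for i in range(s) for j in range(i + 1, s)
            if q[i][j] > 0 or q[j][i] > 0}

# Quotient of a triangle {0, 1, 2} with pendant vertex 3 attached to 0,
# orbits ordered as [{0}, {1, 2}, {3}]; q[1][1] = 1 is a loop
q = [[0, 2, 1],
     [1, 1, 0],
     [1, 0, 0]]
edges_s = s_quotient_edges(q)
```

Here the weight 2, the direction asymmetry and the loop on the second orbit all disappear, leaving a three-vertex path.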



PROPERTIES OF QUOTIENTS

Relative size

Since quotients and s-quotients are formed by factoring out network redundancy they can be significantly smaller than their parent networks. Table I shows that many empirical s-quotients are less than 50% of the size of their parent network, illustrating that much real-world network structure is due to repetition of structurally identical elements.

In order to investigate relative sizes of s-quotients we examined the correlation between various measures of network symmetry and the ratio of the size of the s-quotient to that of its parent (the reduction ratio). We used two different indices to quantify network symmetry: (1) $\beta_G$, the $N$th root of the ratio of automorphism group size to that of the (maximally symmetric) complete graph of the same size, whose automorphism group has order $N!$:

$$\beta_G = \left( \frac{|\mathrm{Aut}(G)|}{N!} \right)^{1/N},$$

and (2) $\gamma_G$, the ratio of the number of vertices in non-trivial orbits to $N$, the number of vertices in the network:

$$\gamma_G = \frac{\sum_{|A_i| > 1} |A_i|}{N}.$$

We define the size of a network as $|G| = N + M$, where $M$ is the number of edges in $G$. The quotient reduction ratio is defined as $r_Q = |Q_s| / |G|$.

Fig. 2 shows the correlation between these two measures of symmetry and the quotient reduction ratio for eleven representative real-world networks (further details of these networks are given in Table I). The correlation coefficient between $r_Q$ and $\beta_G$ is -0.7567; the correlation coefficient between $r_Q$ and $\gamma_G$ is -0.9767, illustrating that the degree of symmetry and the relative size of the s-quotient are strongly negatively correlated over a variety of networks.
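
The two symmetry indices and the reduction ratio are simple to evaluate once the automorphism group size and orbit sizes are known. The sketch below is illustrative only: the function names are hypothetical, and evaluating the first index in log space (via `math.lgamma`) is a choice made here to avoid overflow in $N!$, not something stated in the text. The reduction ratio is evaluated on the vertex and edge counts quoted for the theoretical computer science network in Fig. 1.

```python
import math

def beta(aut_size, n):
    """(|Aut(G)| / N!)^(1/N), computed in log space; lgamma(n + 1) = log(n!)."""
    return math.exp((math.log(aut_size) - math.lgamma(n + 1)) / n)

def gamma(orbit_sizes):
    """Fraction of vertices lying in non-trivial orbits (size > 1)."""
    n = sum(orbit_sizes)
    return sum(size for size in orbit_sizes if size > 1) / n

def reduction_ratio(n, m, n_s, m_s):
    """|Q_s| / |G| with network size defined as vertices plus edges."""
    return (n_s + m_s) / (n + m)

# Complete graph on 6 vertices: |Aut| = 6! = 720, so the index is 1
beta_complete = beta(720, 6)
# Hypothetical orbit sizes: one orbit of three vertices, three fixed vertices
gamma_toy = gamma([1, 3, 1, 1])
# Counts from the Fig. 1 caption: N = 1025, M = 1043, N_s = 511, M_s = 525
r_tcs = reduction_ratio(1025, 1043, 511, 525)
```

With the Fig. 1 counts the reduction ratio is 1036/2068, or roughly one half.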



Heterogeneity 

Network heterogeneity, that is, the degree to which different vertices play different roles or possess different properties in a network, is important in determining many dynamic network properties such as robustness [25] and synchronization [26]. A completely heterogeneous network is one in which all vertices play a unique structural role (that is, the network has a trivial automorphism group), while a completely homogeneous network is one in which all vertices play the same structural role (that is, the network has a transitive automorphism group) [13]. Since structurally equivalent elements are removed in the quotient while structurally non-equivalent elements are preserved, it is immediate that network quotients are completely heterogeneous: all vertices in the quotient play a different structural role (see the quotient in Fig. 1 for example). However, since edge weights, directions and loops are removed in the s-quotient, some vertices may still play the same structural role (for example, in the s-quotient in Fig. 1 the red and white vertices are structurally equivalent, as are the yellow and black vertices, and the green and purple vertices). Thus, although s-quotients may not be completely heterogeneous, we expect that they will be more heterogeneous than their parent networks.

In order to assess network heterogeneity we used two distinct measures: degree-based entropy [27] $H_d(G)$ and symmetry-based entropy [17] $H_s(G)$. These two entropies have a common algebraic form:

$$H_{d,s}(G) = -\sum_i p_i \log p_i,$$

where $p_i$ is the probability that a vertex has degree $i$ when calculating $H_d(G)$; and $p_i$ is the probability that $v \in A_i$ when calculating $H_s(G)$. In order to compare networks of different sizes we normalized these measures as follows:

$$\bar{H}_{d,s}(G) = \frac{H_{d,s}(G) - \min(H_{d,s}, N)}{\max(H_{d,s}, N) - \min(H_{d,s}, N)},$$

where $\max(H_{d,s}, N)$ and $\min(H_{d,s}, N)$ are the maximal and minimal entropy values for a network with $N$ vertices respectively. Fig. 3 summarizes these two entropy measures for the 11 empirical networks in Table I. As expected, in all cases the s-quotient is more heterogeneous than its parent, indicating that structural features which contribute to network homogeneity are factored out in the s-quotient while structural features which contribute to network heterogeneity are preserved in the s-quotient.

FIG. 1: Networks and their quotients. (a) A hypothetical network; (b) its quotient and (c) its s-quotient. (d) A real-world example: the network of ties between Ph.D. students and their advisers in theoretical computer science. Each edge links an adviser to a student. Vertices in the same orbit are the same color. (e) The s-quotient of the theoretical computer science network: vertices are colored as the orbits in the parent network. There are 1025 vertices in the parent and only 511 in the s-quotient (a reduction of 50.15%); similarly there are 1043 edges in the parent and only 525 in the s-quotient (a reduction of 49.67%). Empirical networks were visualized using Pajek.

TABLE I: Statistics for representative networks and their s-quotients. Summarized statistics for each network are: the number of nodes $N$; the number of edges $M$; the mean vertex degree $Z$; the assortative mixing coefficient $r$; the mean geodesic distance $m$; the diameter $D$; and the clustering coefficient $C$ [5]. The subscript $s$ indicates quantities for s-quotients. We also calculate the ratio of $N_s$ to $N$; the ratio of $M_s$ to $M$; and the ratio of $M - M_s$ to $N - N_s$, denoted $z$. In all cases, we consider properties of the underlying graph of the largest connected component of the parent networks. Except for PPI, InternetAS and Homo, all network data can be downloaded from http://vlado.fmf.uni-lj.si/pub/networks/data/.
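
The degree-based and symmetry-based entropies used in this section share one formula and differ only in how vertices are labeled. The sketch below makes that explicit; the function names and the six-vertex example are hypothetical. The normalization bounds are passed in by the caller because the text does not state them explicitly; using 0 and $\log N$ for the symmetry-based entropy is an assumption made here (a single orbit gives 0, and $N$ trivial orbits give $\log N$).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy -sum p_i log p_i of the label frequencies."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def normalise(h, h_min, h_max):
    """Rescale an entropy to [0, 1] given its minimal and maximal values."""
    return (h - h_min) / (h_max - h_min)

# Hypothetical six-vertex network: hub 0, leaves 1-3, tail 0-4-5
degrees = [4, 1, 1, 1, 2, 1]     # degree of each vertex
orbit_of = [0, 1, 1, 1, 2, 3]    # orbit index of each vertex

h_d = entropy(degrees)           # degree-based entropy
h_s = entropy(orbit_of)          # symmetry-based entropy
h_s_norm = normalise(h_s, 0.0, math.log(len(orbit_of)))  # assumed bounds
```

Because vertices in the same orbit share a degree, the orbit labeling refines the degree labeling, so the symmetry-based entropy is never smaller than the degree-based one.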



Vertex degree distributions 

A quotient's vertex degree distribution is strongly related to that of its parent. Recall that all vertices in the same orbit have the same degree. Thus, $n_{\mathrm{out}}(k, Q) = O_k$, where $O_k$ is the number of orbits of degree $k$ in $G$ and $n_{\mathrm{out}}(k, Q)$ is the number of vertices in $Q$ with out-degree $k$. We may think of the vertex out-degree distribution in the quotient as being formed by measuring the degree of one representative vertex from each orbit in $G$. Thus, the quotient vertex out-degree distribution represents the 'essential' vertex degree distribution of its parent and is dependent upon both the vertex degree distribution and the symmetry structure of its parent.

FIG. 2: Network symmetry and s-quotient size are inversely correlated.

FIG. 3: Heterogeneity of empirical networks and their s-quotients. The horizontal axis shows the ratio of heterogeneity in the s-quotient to that of its parent, defined as $H_{d,s}(Q_s)/H_{d,s}(G) - 1$.

Hub vertices (those with high degree) often dominate real-world network topology, and consequently crucially affect network properties such as robustness [25] and traffic along geodesics [41]. Hence, in order to accurately preserve network properties, quotients should preserve hub vertices. Since they generally connect many disparate regions of a network, hub vertices are more likely to be fixed by the automorphism group than are vertices of low degree, and consequently we expect that generically this is indeed the case.

Much of the symmetry present in many real-world networks is due to the presence of bicliques [17, 18] and, in particular, the presence of stars [16] (a $k$-star is a subgraph consisting of a central vertex of degree $> k$ adjacent to $k$ vertices of degree 1). In $k$-stars, the $k$ vertices of degree 1 are structurally equivalent to each other and collapse to a single vertex in the quotient. Thus, each $k$-star reduces the vertex order of the quotient by $k - 1$. Fig. 1 shows how a 3-star (in white on the left) collapses to a single vertex in the quotient and s-quotient. In networks which contain a significant number of bicliques or stars, the s-quotient is formed largely by 'pruning' appropriate vertices of small degree from the parent network while fixing hubs.
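
The star-pruning argument gives a quick estimate of how many vertices stars alone remove, without computing any automorphisms. The following is a minimal sketch under that reading; the function name and the example adjacency structure are hypothetical, and the count covers only symmetry arising from stars, so it is a lower bound on the total reduction.

```python
def star_reduction(adj):
    """Number of vertices removed from the quotient by collapsing k-stars.

    adj maps each vertex to the set of its neighbours. For every vertex we
    count its degree-1 neighbours k; when k >= 2 these k leaves fall into a
    single orbit, which removes k - 1 vertices.
    """
    removed = 0
    for nbrs in adj.values():
        k = sum(1 for u in nbrs if len(adj[u]) == 1)
        if k >= 2:
            removed += k - 1
    return removed

# Hypothetical network: hub 0 carries a 3-star (leaves 1, 2, 3) and a tail 0-4-5
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5}, 5: {4}}
pruned = star_reduction(adj)
```

In this example the 3-star contributes a reduction of two vertices, and the single leaf at the end of the tail contributes nothing.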

In order to assess the degree to which hub vertices are preserved in quotients, and the degree to which quotients are formed by pruning vertices of low degree, we investigated the degree distributions of those vertices that have been factored out in s-quotients for a variety of real-world networks. In particular, we considered two distinct quantities: (1) $P_k$, the number of vertices of degree $k$ factored out in the s-quotient as a percentage of the total number of vertices in the parent network:

$$P_k = \frac{N_k - O_k}{N} \times 100\%,$$

where $N_k$ is the number of vertices with degree $k$ and $O_k$ is the number of orbits in which each vertex has degree $k$; and (2) $R_k$, the number of vertices of degree $k$ factored out in the s-quotient as a percentage of the total number of vertices factored out:

$$R_k = \frac{N_k - O_k}{N - |A|} \times 100\%.$$

Fig. 4 shows that generally in empirical networks only vertices with small degree are factored out in the s-quotient (in all tested cases the maximum degree of any factored vertex was 29); vertices with higher degree tended to be fixed by the automorphism group and thus retained in the s-quotient.
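
Both percentages follow directly from the vertex degrees and the orbit partition. The sketch below is an illustration only; the function name and the example network are hypothetical.

```python
from collections import Counter

def factored_degree_stats(degree, orbits):
    """Return {k: (P_k, R_k)} in percent.

    degree maps each vertex to its degree; orbits is a list of vertex lists.
    """
    n = len(degree)
    n_k = Counter(degree.values())                        # vertices of degree k
    o_k = Counter(degree[orbit[0]] for orbit in orbits)   # orbits of degree k
    factored = n - len(orbits)                            # N - |A|
    stats = {}
    for k in sorted(n_k):
        removed = n_k[k] - o_k[k]
        p_k = 100.0 * removed / n
        r_k = 100.0 * removed / factored if factored else 0.0
        stats[k] = (p_k, r_k)
    return stats

# Hypothetical network: hub 0 with leaves 1, 2, 3 and a tail 0-4-5
degree = {0: 4, 1: 1, 2: 1, 3: 1, 4: 2, 5: 1}
orbits = [[0], [1, 2, 3], [4], [5]]
stats = factored_degree_stats(degree, orbits)
```

In this toy case every factored vertex has degree 1, so the second percentage is 100 for degree 1 and zero for every other degree, mirroring the observation that only low-degree vertices are removed.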

FIG. 4: Factored degree distributions in s-quotients. The symbol ○ gives the reduction ratio $P_k$; the symbol □ gives the reduction ratio $R_k$.

FIG. 5: Mean geodesic distances in s-quotients. The ratio $(m_s/m) - 1$ is plotted, where $m_s$ and $m$ are the mean geodesic distances in the s-quotient and its parent respectively.

In order to identify the proportion of vertices which are factored out by degree we considered two further quantities. Let $d(v)$ be the degree of a vertex $v$. Consider now the total network degree-set $\mathrm{Deg} = \{d(v) \mid v \in G\}$ (i.e. the set of all vertex degrees), and the nontrivial orbit degree-set $\mathrm{Deg}' = \{d(v) \mid v \in A_i, |A_i| > 1\}$ (i.e. the set of degrees of those vertices in nontrivial orbits). Note that $\mathrm{Deg}' \subseteq \mathrm{Deg}$. We define two quantities based upon these sets:

$$\mu = \frac{|\mathrm{Deg}'|}{|\mathrm{Deg}|} \times 100\%,$$

the percentage of the degree-set factored out in the s-quotient, and

$$\nu = \frac{\max(\mathrm{Deg}')}{\max(\mathrm{Deg})} \times 100\%,$$

the maximum vertex degree factored out in the s-quotient as a percentage of the maximum vertex degree in the parent network. Fig. 2 shows these measures for 6 real-world networks. It is clear that vertices which are factored out in the s-quotient constitute only a minority of the whole network degree-set (the maximum value for $\mu$ we found was 26.51%); and that only vertices with relatively low degree are factored out (the maximum value for $\nu$ we found was 20.59%). Furthermore, Table I also shows that it is common for s-quotients to have an average degree larger than that of their parent, demonstrating that vertices of small degree are more likely to lie in non-trivial orbits (and thus be factored out in the quotient) than are hub vertices (which are generically retained).
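
The two degree-set percentages can be computed in a few lines. This is a minimal sketch with a hypothetical function name and example; it returns 0 for the second quantity when the network has no non-trivial orbits, which is a convention chosen here.

```python
def degree_set_stats(degree, orbits):
    """Return the two degree-set percentages for a network.

    degree maps each vertex to its degree; orbits is a list of vertex lists.
    """
    deg = set(degree.values())
    # degrees realised by vertices lying in non-trivial orbits
    deg_nontrivial = {degree[v] for orbit in orbits if len(orbit) > 1
                      for v in orbit}
    share = 100.0 * len(deg_nontrivial) / len(deg)
    max_share = 100.0 * max(deg_nontrivial, default=0) / max(deg)
    return share, max_share

# Hypothetical network: hub 0 with leaves 1, 2, 3 and a tail 0-4-5
degree = {0: 4, 1: 1, 2: 1, 3: 1, 4: 2, 5: 1}
orbits = [[0], [1, 2, 3], [4], [5]]
share, max_share = degree_set_stats(degree, orbits)
```

Here one of the three distinct degrees appears in a non-trivial orbit, and the largest factored degree (1) is a quarter of the maximum degree (4).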



Communication properties 

Many large empirical complex networks are 'small-world', meaning that there exists a relatively short path between any two vertices in the network. The shortest path between a pair of vertices is known as a geodesic and the length of the longest geodesic is known as the diameter of the network, which we denote $D(G)$. The distribution of geodesic distances and the network diameter both significantly affect dynamic network properties such as information transfer [13] and tolerance to attack.

Table I shows the network diameter and s-quotient diameter for a variety of empirical networks. In all cases, network diameter is preserved exactly in the s-quotient. For example, in the Eva network, a telecommunications and media ownership network, the vertex and edge numbers of the s-quotient are 20% and 22.7% of those of the original network respectively, yet network diameter is maintained in the s-quotient. In this case the s-quotient is substantially smaller than its parent, yet it preserves the communication properties of its parent. In fact, this empirical observation is true for all 'locally-symmetric' networks.



Intuitively, a network is globally symmetric if the longest geodesic is between vertices in the same orbit (that is, there are automorphisms which permute distant vertices); otherwise the network is locally symmetric (that is, all automorphisms act on local vertex subsets). Since many real-world networks are commonly subject to continuous stochastic fluctuations in topology, we do not expect, and did not find, any large real-world networks to be globally symmetric.

The s-quotient describes the orbit adjacency structure of its parent network. Thus, network diameter is exactly preserved in the s-quotient as long as the parent network is not globally symmetric. For an illustration of this see the network shown in Fig. 1(a). In this network the longest geodesic is between any of the red vertices on the right and any of the white vertices on the left, and the network has diameter 5. The s-quotient of this network (shown in Fig. 1(c)) also has diameter 5, and diameter is preserved in the s-quotient since orbit adjacency is preserved.

While the diameter of a network is related to the maximum information transfer cost in the network, mean geodesic distance is related to the average transit cost. Empirical measurements show that the disparity between mean geodesic distance in the s-quotient and its parent is usually quite small. As shown in Fig. 5, for all tested networks the mean geodesic distance of the s-quotient is within 10% of that of its parent network, irrespective of the relative size of the s-quotient to its parent. Since both network diameter and mean geodesic distance are robustly inherited by the s-quotient from its parent, we conclude that the s-quotient forms the communication skeleton of its parent.
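
The comparison of diameter and mean geodesic distance between a parent and its s-quotient can be sketched with breadth-first search. The code below is illustrative only: the function names are hypothetical, the orbit labeling of the small example tree is supplied by hand rather than computed, and the networks are assumed to be connected.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter_and_mean(adj):
    """Diameter and mean geodesic distance of a connected graph."""
    all_d = []
    for u in adj:
        d = distances_from(adj, u)
        all_d.extend(d[w] for w in adj if w != u)
    return max(all_d), sum(all_d) / len(all_d)

def s_quotient(adj, orbit_of):
    """Collapse each orbit to one vertex; drop loops, weights and directions."""
    q = {o: set() for o in set(orbit_of.values())}
    for u, nbrs in adj.items():
        for w in nbrs:
            if orbit_of[u] != orbit_of[w]:
                q[orbit_of[u]].add(orbit_of[w])
                q[orbit_of[w]].add(orbit_of[u])
    return q

# Hypothetical locally symmetric tree: leaves 1-3 on hub 0, path 0-4-5,
# leaves 6-7 on vertex 5
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5},
       5: {4, 6, 7}, 6: {5}, 7: {5}}
orbit_of = {0: "a", 1: "b", 2: "b", 3: "b", 4: "c", 5: "d", 6: "e", 7: "e"}

diam, mean = diameter_and_mean(adj)
diam_s, mean_s = diameter_and_mean(s_quotient(adj, orbit_of))
```

In this toy tree the diameter is 4 in both the parent and the s-quotient, while the mean geodesic distance drops from 66/28 to 2. The relative change is larger than in the empirical networks discussed above, which is unsurprising for an eight-vertex example.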



CONCLUSIONS 

The quotient of a network is formed by associating 
structurally equivalent vertices into disjoint equivalence 
classes and considering adjacency relationships between 
these equivalence classes. Thus quotients capture all es- 
sential network complexity, yet formally exclude all struc- 
tural redundancy. Quotients may therefore be thought of 
as the structural skeletons of the systems from which they 
are derived. Consequently properties of the quotient, and 
not those of the parent network per se, describe core sys- 
tem complexity. Observation of the statistics of real- 
world networks verifies that elements which contribute 
to network homogeneity (or simplicity) are removed in 
network quotients; while the elements which contribute 
to the heterogeneity (or complexity) are completely re- 
tained in quotients. 

Many biological networks are thought to form 
by growth with vertex duplication, or partial 
duplication [13]. Vertex duplication is useful in a 
biological context since it naturally endows biological 
regulatory systems with functional redundancy, thus 
reinforcing against damage. Quotients of biological 
regulatory networks therefore encode core relationships 
between biochemical control motifs, minus any repeti- 
tions due to redundancy. While the large-scale properties 
of analogous biological regulatory networks are often 
remarkably similar across a broad range of species, their 
detailed properties can differ significantly [43]. Thus 
it may be of particular interest to explore the similarities 
between structural properties of quotients of regulatory 
networks for various different species, since this could 
provide a new means to analyze functional conservation 
of regulatory motifs across species. 

Finally, since quotients carry the structure of their parents, yet are often substantially smaller, performing analysis directly on quotients, rather than on the corresponding parent networks, can reduce the complexity of network algorithms. For example, average shortest path length computation time can be reduced from $O(NM)$ to $O(r_N r_M NM)$ if calculated on the s-quotient (where $r_N = N_s/N$ and $r_M = M_s/M$).
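
As a rough worked illustration, take the Eva network figures quoted earlier, in which the vertex and edge numbers of the s-quotient are 20% and 22.7% of those of the parent. Assuming the same cost bound applies to the s-quotient, the product of the two ratios is

$$r_N r_M = 0.20 \times 0.227 \approx 0.045,$$

so the estimated cost on the s-quotient is about 4.5% of the cost on the parent, a reduction of roughly 22-fold. This counts only the shortest-path computation itself and ignores the cost of finding the automorphism partition.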

We anticipate that further investigation of properties 
of network quotients will be both of theoretical and prag- 
matic interest. 